Data Preprocessing

  1. First, I investigated which variables are time dependent and excluded those that were clearly unnecessary (i.e., “SITE”, “COLPROT”, “ORIGPROT”, “FLDSTRENG”, “FSVERSION”, “IMAGEUID”, “Month_bl”, “Month”, “M”, “update_stamp”).

  2. Next, I merged the time-dependent and time-independent variables into the long_dat data frame and recoded the time points in the VISCODE variable as integers (baseline = 0).

long_dat <- dat[, c(ivars[,1], nivars[,1])] %>%
  mutate(VISCODE = match(VISCODE, c("bl", "m03", "m06", "m12", "m18", "m24", 
                                    "m30","m36", "m42", "m48", "m54", "m60", 
                                    "m66", "m72","m78", "m84", "m90", "m96", 
                                    "m102", "m108","m114", "m120", "m126", 
                                    "m132", "m144", "m156"))-1) %>%
  relocate(RID, PTID, VISCODE) %>%
  arrange(RID, VISCODE)
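
As a quick check of this recoding, the sketch below maps a few visit codes to the integer time index used above and, as an added convenience that is not part of the original code, to months since baseline. The helper names `viscode_to_index` and `viscode_to_months` are hypothetical.

```r
# Visit codes in chronological order, as used in the recoding above
visit_codes <- c("bl", "m03", "m06", "m12", "m18", "m24", "m30", "m36",
                 "m42", "m48", "m54", "m60", "m66", "m72", "m78", "m84",
                 "m90", "m96", "m102", "m108", "m114", "m120", "m126",
                 "m132", "m144", "m156")

# Hypothetical helper: zero-based position of a visit code (baseline = 0)
viscode_to_index <- function(x) match(x, visit_codes) - 1L

# Hypothetical helper: months since baseline, parsed from the code itself
viscode_to_months <- function(x) {
  x[which(x == "bl")] <- "m0"
  as.integer(sub("^m", "", x))
}

codes <- c("bl", "m06", "m42", "m48")
data.frame(VISCODE = codes,
           index   = viscode_to_index(codes),
           months  = viscode_to_months(codes))
```

With this coding, the index is not a month count: index 8 is `m42` and index 9 is `m48`.
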
  3. The original data frame contained many _bl or _BL variables. I therefore checked whether these baseline columns had already been integrated into the corresponding time-dependent variable at the corresponding time point for each participant. They had not.

  4. Therefore, I merged the _bl/_BL variables with the corresponding time-dependent variable for each participant. Additionally, I specified the data type of each variable individually for optimal control and oversight over the data structure.

  5. Transform Long to Wide Data Format

## # A tibble: 6 × 1,153
##   RID   PTID         AGE PTGENDER PTEDUCAT PTETHCAT PTRACCAT PTMARRY APOE4 FDG_0
##   <fct> <chr>      <dbl> <fct>       <int> <fct>    <fct>    <fct>   <int> <dbl>
## 1 2     011_S_0002  74.3 Male           16 Not His… White    Married     0  1.37
## 2 3     011_S_0003  81.3 Male           18 Not His… White    Married     1  1.08
## 3 4     022_S_0004  67.5 Male           10 Hisp/La… White    Married     0 NA   
## 4 5     011_S_0005  73.7 Male           16 Not His… White    Married     0  1.29
## 5 6     100_S_0006  80.4 Female         13 Not His… White    Married     0 NA   
## 6 7     022_S_0007  75.4 Male           10 Hisp/La… More th… Married     1 NA   
## # ℹ 1,143 more variables: FDG_2 <dbl>, FDG_7 <dbl>, FDG_11 <dbl>, FDG_12 <dbl>,
## #   FDG_13 <dbl>, FDG_14 <dbl>, FDG_15 <dbl>, FDG_16 <dbl>, FDG_17 <dbl>,
## #   FDG_18 <dbl>, FDG_19 <dbl>, FDG_21 <dbl>, FDG_22 <dbl>, FDG_23 <dbl>,
## #   FDG_24 <dbl>, FDG_3 <dbl>, FDG_4 <dbl>, FDG_5 <dbl>, FDG_6 <dbl>,
## #   FDG_9 <dbl>, FDG_8 <dbl>, FDG_10 <dbl>, FDG_25 <dbl>, FDG_20 <dbl>,
## #   FDG_1 <dbl>, PIB_0 <dbl>, PIB_2 <dbl>, PIB_7 <dbl>, PIB_11 <dbl>,
## #   PIB_12 <dbl>, PIB_13 <dbl>, PIB_14 <dbl>, PIB_15 <dbl>, PIB_16 <dbl>, …
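
The baseline merge and the long-to-wide reshaping can be sketched on toy data as follows. This is a minimal illustration, not the original code: it assumes that "merging" means filling a missing baseline measurement from the matching _bl column, and the data frame `toy_long` and its values are made up.

```r
library(dplyr)
library(tidyr)

# Toy long-format data; FDG_bl repeats the baseline value on every row
toy_long <- tibble(
  RID     = c(2, 2, 2, 3, 3),
  VISCODE = c(0L, 2L, 3L, 0L, 2L),
  FDG     = c(NA, 1.30, 1.25, 1.08, NA),
  FDG_bl  = c(1.37, 1.37, 1.37, 1.08, 1.08)
)

toy_wide <- toy_long %>%
  # Assumption: fill a missing baseline measurement from the _bl column
  mutate(FDG = if_else(VISCODE == 0L, coalesce(FDG, FDG_bl), FDG)) %>%
  select(-FDG_bl) %>%
  # One row per participant, one column per time point (FDG_0, FDG_2, ...)
  pivot_wider(names_from = VISCODE, values_from = FDG,
              names_prefix = "FDG_")

toy_wide
```

Time points at which a participant has no row end up as NA in the wide table.
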

Age Distribution in Data Frame

Attrition Analysis

Based on the number of participants measured at any time point I made a frequency plot to get a first idea of the sampling frequency.

Domains

Demographics

Cognitive Tests

Biomedical Imaging

Biomarkers

Based on these findings, time point 9 appears to be a cut-off after which the number of measurements drops sharply. Time point 9 corresponds to month 48 (i.e., 4 years) of the follow-up.
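
The counts behind such a frequency plot can be obtained along these lines. The sketch uses made-up data and MMSE as the example measure; the object names `toy` and `attrition` are hypothetical.

```r
library(dplyr)

# Toy long-format data: one row per participant and time point
toy <- tibble(
  RID     = c(1, 1, 1, 2, 2, 3),
  VISCODE = c(0L, 2L, 9L, 0L, 2L, 0L),
  MMSE    = c(29, 28, NA, 30, 27, 25)
)

# Number of participants with a non-missing MMSE at each time point
attrition <- toy %>%
  filter(!is.na(MMSE)) %>%
  group_by(VISCODE) %>%
  summarise(n_participants = n_distinct(RID), .groups = "drop")

attrition
```

Plotting `n_participants` against `VISCODE` for each measure shows where the sample thins out.
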

Polygenic Risk Score for Educational attainment

By default, the merge(by.x, by.y) function creates a new data frame that keeps only those rows for which the key (in our case PTID) is present in both data frames. The genetic data contain 2 additional individuals for whom we do not have any other measurements, and these are dropped by the merge. The final data frame for which both testing data and genetic data are available thus contains N = 1408 individuals.
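
The following toy example illustrates this inner-join behaviour of `merge()`. The two 999_S_* identifiers and all values are invented for illustration.

```r
# Phenotype data for three participants
pheno <- data.frame(PTID = c("011_S_0002", "011_S_0003", "022_S_0004"),
                    PTEDUCAT = c(16, 18, 10))

# Genetic data for the same three plus two participants without phenotypes
pgs <- data.frame(PTID = c("011_S_0002", "011_S_0003", "022_S_0004",
                           "999_S_0001", "999_S_0002"),
                  EA22 = c(0.4, -0.1, 1.2, 0.3, -0.8))

# Default merge() is an inner join: only PTIDs present in both are kept
merged <- merge(pheno, pgs, by.x = "PTID", by.y = "PTID")

# Keys with genetic data but no phenotype rows
unmatched <- setdiff(pgs$PTID, pheno$PTID)

nrow(merged)
unmatched
```

Passing `all.y = TRUE` to `merge()` would keep the unmatched genetic rows and fill the phenotype columns with NA.
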

Plot PGS EA vs. Actual EA

Based on this plot, we can see a positive relationship between the polygenic score for educational attainment and actual years of education. This means that individuals with a higher PGS, i.e. a higher genetic capacity for educational attainment, tend to have completed more years of education.

We ran Pearson’s correlation, which resulted in r = 0.286 (p-value < 2.2e-16).
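
A correlation test of this kind can be run with `cor.test()`. The sketch below uses simulated data, so the variable values and the resulting estimate are illustrative only.

```r
set.seed(1)
n <- 500

# Simulated standardized polygenic score and years of education
EA22     <- rnorm(n)
PTEDUCAT <- 16 + 0.8 * EA22 + rnorm(n, sd = 2.7)

ct <- cor.test(EA22, PTEDUCAT, method = "pearson")

ct$estimate  # Pearson's r
ct$p.value   # exact p-value
```

When the result is printed, R shows "p-value < 2.2e-16" for any p-value below machine precision; `ct$p.value` holds the value actually computed.
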

Check linear regression assumptions

# Linear regression model
model <- lm(PTEDUCAT ~ EA22 + AGE + PTGENDER, data = long_dat)

# Check model assumptions (check_model() is from the performance package)
library(performance)
check_model(model)

Create Residuals

To obtain the residuals, we regressed actual educational attainment (PTEDUCAT) on the polygenic score for educational attainment (EA22), with sex (PTGENDER) and age (AGE) as covariates. The results are depicted in the density plot.

How to interpret the Residuals?

It is important to interpret the residual scores correctly. A residual is the actual value minus the predicted value, so a high (positive) residual means that the individual has over-performed relative to his or her genetic capacity. The following table illustrates this:

##   Actual Predicted  Residuals
## 1     18  16.91911  1.0808864
## 2     16  15.16815  0.8318481
## 3     12  16.64336 -4.6433625
## 4     20  16.02560  3.9743989
## 5     14  14.83958 -0.8395765
## 6     13  15.37284 -2.3728412
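
The identity behind this table (residual = actual minus predicted) can be verified on simulated data. The data frame `toy` and its values are made up; only the model formula follows the one used above.

```r
set.seed(42)
n <- 200

# Simulated stand-ins for the variables in the regression model
toy <- data.frame(
  EA22     = rnorm(n),
  AGE      = runif(n, 55, 90),
  PTGENDER = factor(sample(c("Female", "Male"), n, replace = TRUE))
)
toy$PTEDUCAT <- 16 + 0.8 * toy$EA22 + rnorm(n, sd = 2.5)

model <- lm(PTEDUCAT ~ EA22 + AGE + PTGENDER, data = toy)

res_tab <- data.frame(
  Actual    = toy$PTEDUCAT,
  Predicted = fitted(model),
  Residuals = resid(model)
)

# Largest deviation from Actual - Predicted; only floating-point error remains
max(abs(res_tab$Residuals - (res_tab$Actual - res_tab$Predicted)))
head(res_tab)
```
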

Survival Analysis

Using the ntile function from dplyr, the lower tertile is assigned the value 1 (~ negative residual), the middle tertile the value 2, and the upper tertile the value 3 (~ positive residual). The analysis is limited to time points up to the 9th follow-up (i.e., 48 months).
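
A minimal sketch of this grouping step on made-up data is shown below. The column name `thirtile_res` follows the survival output further down; `toy` and the `residual` column are hypothetical.

```r
library(dplyr)

# Toy data: one residual and one time index per participant
toy <- tibble(
  RID      = 1:9,
  residual = c(-4.6, -2.4, -0.8, 0.1, 0.8, 1.1, 2.0, 3.1, 4.0),
  VISCODE  = c(0L, 3L, 9L, 10L, 5L, 9L, 12L, 2L, 7L)
)

toy_tertiles <- toy %>%
  mutate(thirtile_res = ntile(residual, 3)) %>%  # 1 = lowest, 3 = highest
  filter(VISCODE <= 9L)                          # keep time points up to 9

toy_tertiles
```

`ntile()` splits the rows into groups of (nearly) equal size by rank, so the group boundaries depend on the sample rather than on fixed residual values.
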

Mini-Mental State Examination (MMSE)

“The mini–mental state examination (MMSE) is a 30-point questionnaire that is used extensively in clinical and research settings to measure cognitive impairment. It is commonly used in medicine and allied health to screen for dementia. It is also used to estimate the severity and progression of cognitive impairment and to follow the course of cognitive changes in an individual over time; thus making it an effective way to document an individual’s response to treatment. Administration of the test takes between 5 and 10 minutes and examines functions including registration (repeating named prompts), attention and calculation, recall, language, ability to follow simple commands and orientation. […] Any score of 24 or more (out of 30) indicates a normal cognition. Below this, scores can indicate severe (≤9 points), moderate (10–18 points) or mild (19–23 points) cognitive impairment.” (Wikipedia.org). The MMSE scores were normalized using the NormPsy package and then the cut-off was calculated.
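
The severity bands in the quotation can be expressed with `cut()`. This sketch applies them to raw 0–30 scores only; it does not reproduce the NormPsy normalization or the cut-off used in this analysis, and `mmse_band` is a hypothetical name.

```r
# Example raw MMSE scores, including the band boundaries
mmse <- c(30, 24, 23, 19, 18, 10, 9, 0)

# Bands from the quoted description: <=9 severe, 10-18 moderate,
# 19-23 mild, >=24 normal (intervals are closed on the right)
mmse_band <- cut(mmse,
                 breaks = c(-Inf, 9, 18, 23, 30),
                 labels = c("severe", "moderate", "mild", "normal"))

data.frame(MMSE = mmse, band = mmse_band)
```
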

Boxplots of MMSE by Age Group at Baseline

To see whether it is necessary to stratify by age group, the effects of the polygenic risk score for EA and of age group on MMSE were tested using linear regression, once for the raw and once for the normalized MMSE scores. The results are displayed below.

## 
## Call:
## lm(formula = MMSE ~ EA22 + Age_Group, data = long_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -27.415  -1.189   1.109   2.458   3.473 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 27.12540    0.16657 162.844  < 2e-16 ***
## EA22         0.45636    0.11384   4.009 6.19e-05 ***
## Age_Group   -0.01018    0.06723  -0.151     0.88    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.642 on 4822 degrees of freedom
##   (6151 observations deleted due to missingness)
## Multiple R-squared:  0.003326,   Adjusted R-squared:  0.002912 
## F-statistic: 8.045 on 2 and 4822 DF,  p-value: 0.0003249
## 
## Call:
## lm(formula = MMSE_norm ~ EA22 + Age_Group, data = long_dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -77.457 -14.481   1.461  21.439  29.791 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  75.4231     0.9389  80.330  < 2e-16 ***
## EA22          3.7284     0.6416   5.811 6.62e-09 ***
## Age_Group    -0.1925     0.3790  -0.508    0.611    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.53 on 4822 degrees of freedom
##   (6151 observations deleted due to missingness)
## Multiple R-squared:  0.006955,   Adjusted R-squared:  0.006543 
## F-statistic: 16.89 on 2 and 4822 DF,  p-value: 4.923e-08

MMSE Survival Analysis

Next the survival analysis was conducted for the genetic capacity of educational attainment and the residual educational attainment.

The log-rank p-values for the difference between the high and low groups are 3.354398 × 10^{-5} for genetic capacity and 4.5572169 × 10^{-23} for residual educational attainment. In the next step, the survival analysis was conducted stratified by genetic capacity for educational attainment.

Log Test Low: 7.2954207 × 10^{-10}
Log Test Middle: 1.5611391 × 10^{-16}
Log Test High: 3.2760959 × 10^{-5}
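
A log-rank comparison of this kind can be run with `survdiff()` from the survival package. The sketch below uses simulated data and assumes the event indicator marks the time point at which a score crosses its cut-off. The column name `MMSE_cut` is hypothetical, chosen by analogy with the `*_cut` variables in the output further down.

```r
library(survival)

set.seed(123)
n <- 300

# Simulated data: lowest vs. highest residual tertile, time index 1-9
toy <- data.frame(
  thirtile_res = sample(c(1, 3), n, replace = TRUE),
  VISCODE      = sample(1:9, n, replace = TRUE)
)
# Event indicator: 1 = score crossed the cut-off, 0 = censored
toy$MMSE_cut <- rbinom(n, 1, ifelse(toy$thirtile_res == 1, 0.5, 0.3))

fit <- survdiff(Surv(as.integer(VISCODE), MMSE_cut) ~ thirtile_res,
                data = toy)

# p-value of the log-rank test from the chi-square statistic
df      <- length(fit$n) - 1
p_value <- pchisq(fit$chisq, df = df, lower.tail = FALSE)

fit
p_value
```

Computing the p-value from `fit$chisq` gives the exact number rather than the rounded value shown in the printed output.
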

Alzheimer’s Disease Assessment Scale

The Cognitive Subscale Alzheimer’s Disease Assessment Scale (ADAS) is made of 11 tasks that include both subject-completed tests and observer-based assessments, assessing the memory, language, and praxis domains. The result is a global final score ranging from 0 to 70, based on the sum of the scores of the single tasks (ADAS11).

Beyond the ADAS11 score, the ADNI study included also an additional test of delayed word recall and a number cancellation or maze task, which are further summed to have a new total score that ranges from 0 to 85 (ADAS13).

In addition, the score of the task 4 (Word Recognition, ADASQ4) was included in the ADNIMERGE dataset.

(Grassi et al., 2019)

ADAS11

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), ADAS11_cut) ~ thirtile_res, 
##     data = .)
## 
## n=5540, 5 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 2799     1198     1008      36.0      81.7
## thirtile_res=3 2741      800      990      36.6      81.7
## 
##  Chisq= 81.7  on 1 degrees of freedom, p= <2e-16
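
The `(O-E)^2/E` column can be recomputed approximately from the printed counts, which helps when reading these tables. The inputs below are the rounded values from the output above, so the results differ slightly from the printed column.

```r
# Observed and expected event counts as printed for tertiles 1 and 3
observed <- c(1198, 800)
expected <- c(1008, 990)

# Per-group contribution (O - E)^2 / E
contrib <- (observed - expected)^2 / expected
round(contrib, 1)
```

Tertile 1 has more observed events than expected and tertile 3 has fewer. This is the direction of the difference that the chi-square statistic tests.
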

ADAS13

“The ADAS13 was included as a global measure of cognitive function. ADAS13 is a test battery developed to assess severity of cognitive impairment associated with AD and includes subtests and clinical evaluations assessing memory function, reasoning, language function, orientation and praxis. The ADAS13 is a modified version of the original ADAS-Cog-11, adding a cancellation task and a delayed free recall task. The higher the scores, the more severe impairment of cognitive function.” (Mofrad et al., 2021)

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), ADAS13_cut) ~ thirtile_res, 
##     data = .)
## 
## n=7290, 27 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3649     1619     1354      51.8       111
## thirtile_res=3 3641     1095     1360      51.6       111
## 
##  Chisq= 111  on 1 degrees of freedom, p= <2e-16

ADASQ4

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), ADASQ4_cut) ~ thirtile_res, 
##     data = .)
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3659     1492     1273      37.8      80.2
## thirtile_res=3 3658     1067     1286      37.4      80.2
## 
##  Chisq= 80.2  on 1 degrees of freedom, p= <2e-16

CDRSB

“The clinical dementia rating (CDR) scale is commonly used to diagnose dementia due to Alzheimer’s disease (AD). The sum of boxes of the CDR (CDR-SB) has recently been emphasized and applied to interventional trials for tracing the progression of cognitive impairment (CI) in the early stages of AD.” (Tzeng et al., 2022)

See Table 3 for explanation on the staging category (O’Bryant et al., 2012)

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), CDRSB_cut) ~ thirtile_res, 
##     data = .)
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3659     2779     2566      17.6      39.8
## thirtile_res=3 3658     2384     2597      17.4      39.8
## 
##  Chisq= 39.8  on 1 degrees of freedom, p= 3e-10

DIGITSCORE

“The DSST (Digit Symbol Substitution Test) is a paper-and-pencil cognitive test presented on a single sheet of paper that requires a subject to match symbols to numbers according to a key located on the top of the page. The subject copies the symbol into spaces below a row of numbers. The number of correct symbols within the allowed time, usually 90 to 120 seconds, constitutes the score.” (Jaeger, 2018) The lower the scores, the more severe impairment of cognitive function.

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), DIGITSCOR_cut) ~ 
##     thirtile_res, data = .)
## 
## n=4029, 3288 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 2146      455      367      20.9        45
## thirtile_res=3 1883      251      339      22.7        45
## 
##  Chisq= 45  on 1 degrees of freedom, p= 2e-11

FAQ

The Functional Activities Questionnaire is used to assess an individual’s functional abilities in daily living activities. It is a caregiver-based questionnaire that helps evaluate how well a person is able to perform various instrumental activities of daily living (IADLs) and basic activities of daily living (ADLs). (ChatGPT) The item scores are summed to a total score (range 0–30). The score range for each item is 0–3 (higher scores indicate greater impairment; 0 = normal or never did but could do now; 1 = has difficulty but does by self or never did but would have difficulty now; 2 = requires assistance; 3 = dependent). There is no established cut-off score for IADL impairment on the FAQ. However, one study reported that a total FAQ score (sum of all 10 item scores; range 0–30) of ≥ 6 is suggestive of functional impairment. (Marshall et al., 2015)
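
The scoring described above can be sketched as follows. The item ratings are invented, and using ≥ 6 as the flag is an assumption taken from the cited suggestion, not necessarily the cut-off behind `FAQ_cut` in this analysis.

```r
# Toy item ratings (0-3) for three participants on the 10 FAQ items
faq_items <- rbind(
  c(0, 0, 1, 0, 0, 0, 0, 1, 0, 0),
  c(1, 2, 1, 0, 1, 0, 1, 0, 0, 0),
  c(3, 3, 2, 2, 1, 3, 2, 1, 2, 3)
)

faq_total    <- rowSums(faq_items)  # total score, range 0-30
faq_impaired <- faq_total >= 6      # assumed cut-off for impairment

data.frame(faq_total, faq_impaired)
```
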

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), FAQ_cut) ~ thirtile_res, 
##     data = .)
## 
## n=7308, 9 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3659     1281     1090      33.5      70.9
## thirtile_res=3 3649      908     1099      33.2      70.9
## 
##  Chisq= 70.9  on 1 degrees of freedom, p= <2e-16

LDELTOTAL

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), LDELTOTAL_cut) ~ 
##     thirtile_res, data = .)
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3659     2017     1587       117       251
## thirtile_res=3 3658     1174     1604       115       251
## 
##  Chisq= 251  on 1 degrees of freedom, p= <2e-16

MOCA

Reference literature: doi: 10.1111/j.1532-5415.2005.53221.x

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), MOCA_cut) ~ thirtile_res, 
##     data = .)
## 
## n=3896, 3421 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1819     1348     1277      3.97      8.91
## thirtile_res=3 2077     1310     1381      3.67      8.91
## 
##  Chisq= 8.9  on 1 degrees of freedom, p= 0.003

Rey-Auditory Verbal Learning Test (RAVLT)

The RAVLT was included as a measure of memory function. In this test, the participants are asked to recall words from a list of 15 nouns immediately after each of five learning trials and after a short and a long delay. Two measures known to be sensitive to cognitive changes in patients with AD were included in the present study: Immediate recall (RAVLT-Im): the number of correct responses across the immediate recall of the five learning trials; percent forgetting (RAVLT-PF): the score on the fifth learning trial minus the score on the long delayed recall, divided by the score obtained on the fifth learning trial. The lower the scores, the more severe impairment of cognitive function.

Different summary scores are derived from raw RAVLT scores. These include RAVLT Immediate (the sum of scores from 5 first trials (Trials 1 to 5)), RAVLT Learning (the score of Trial 5 minus the score of Trial 1), RAVLT Forgetting (the score of Trial 5 minus score of the delayed recall) and RAVLT Percent Forgetting (RAVLT Forgetting divided by the score of Trial 5). We use naming of the ADNI merge table for these summary measures. We investigated the relationship between MRI measures and RAVLT cognitive test scores by estimating the RAVLT Immediate and RAVLT Percent Forgetting from the gray matter density. These two summary scores were selected since they highlight different aspects of episodic memory, learning (RAVLT Immediate) and delayed memory (RAVLT Percent forgetting), essential to AD, and previous studies (Estévez-González et al., 2003, Wang et al., 2011, Gomar et al., 2014, Moradi et al., 2015) have indicated strong relationships between these two RAVLT measures and Alzheimer’s disease.
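
The four summary scores defined above can be computed as in this sketch. The trial scores are invented, and the column names `trial1` to `trial5` and `delayed` are hypothetical; the summary names follow the `RAVLT_*` variables used in the output below.

```r
# Toy raw scores: words recalled on Trials 1-5 and after the long delay
ravlt <- data.frame(
  trial1 = c(4, 6), trial2 = c(6, 8), trial3 = c(8, 10),
  trial4 = c(9, 12), trial5 = c(10, 13), delayed = c(4, 11)
)

# Summary scores as defined in the text
ravlt$RAVLT_immediate       <- rowSums(ravlt[, paste0("trial", 1:5)])
ravlt$RAVLT_learning        <- ravlt$trial5 - ravlt$trial1
ravlt$RAVLT_forgetting      <- ravlt$trial5 - ravlt$delayed
ravlt$RAVLT_perc_forgetting <- ravlt$RAVLT_forgetting / ravlt$trial5

ravlt
```

Here percent forgetting is left as a proportion, following the definition in the text; multiply by 100 if a percentage is wanted.
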

RAVLT Immediate

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), RAVLT_immediate_cut) ~ 
##     thirtile_res, data = .)
## 
## n=7299, 18 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3651     2612     2796      12.1      27.9
## thirtile_res=3 3648     3006     2822      12.0      27.9
## 
##  Chisq= 27.9  on 1 degrees of freedom, p= 1e-07

RAVLT Percentage Forgetting

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), RAVLT_perc_forgetting_cut) ~ 
##     thirtile_res, data = .)
## 
## n=7290, 27 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3642     1216     1072      19.4      40.8
## thirtile_res=3 3648      940     1084      19.2      40.8
## 
##  Chisq= 40.8  on 1 degrees of freedom, p= 2e-10

RAVLT Forgetting

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), RAVLT_forgetting_cut) ~ 
##     thirtile_res, data = .)
## 
## n=7299, 18 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3651       68     83.5      2.89      5.77
## thirtile_res=3 3648      100     84.5      2.85      5.77
## 
##  Chisq= 5.8  on 1 degrees of freedom, p= 0.02

RAVLT Learning

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), RAVLT_learning_cut) ~ 
##     thirtile_res, data = .)
## 
## n=7299, 18 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3651       39     44.2     0.608      1.21
## thirtile_res=3 3648       50     44.8     0.599      1.21
## 
##  Chisq= 1.2  on 1 degrees of freedom, p= 0.3

TRABSCORE

The Trail Making Test is a neuropsychological test of visual attention and task switching. It has two parts, in which the subject is instructed to connect a set of 25 dots as quickly as possible while maintaining accuracy.

The test can provide information about visual search speed, scanning, speed of processing, mental flexibility, and executive functioning. It is sensitive to cognitive impairment associated with dementia, including Alzheimer’s disease. (ChatGPT)

Record the total number of seconds to complete Part B (Trails B), up to a maximum of 300 seconds. If the participant is not finished by 300 seconds, the score is 300.
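
The 300-second ceiling can be applied with `pmin()`. The times below are made up and the variable names are hypothetical.

```r
# Toy completion times for Part B in seconds (NA = not administered)
seconds_to_complete <- c(75, 182, 300, 345, NA)

# Cap at 300 seconds; unfinished or slower attempts score 300
trabscor <- pmin(seconds_to_complete, 300)
trabscor
```

Because of this ceiling, a score of 300 means "300 seconds or longer", which limits how well the slowest participants can be told apart.
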

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), TRABSCOR_cut) ~ 
##     thirtile_res, data = .)
## 
## n=7252, 65 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 3619      933      738      51.3       106
## thirtile_res=3 3633      554      749      50.6       106
## 
##  Chisq= 106  on 1 degrees of freedom, p= <2e-16

Patient’s Everyday Cognition (EcogPt)

The original version of the ECog is an informant-based measure of cognitively-relevant everyday abilities comprised of 39 items, covering six cognitively-relevant domains: Everyday Memory, Everyday Language, Everyday Visuospatial Abilities, Everyday Planning, Everyday Organization, and Everyday Divided Attention. Ratings are made on a four-point scale: 1 = better or no change compared to 10 years earlier, 2 = questionable/occasionally worse, 3 = consistently a little worse, 4 = consistently much worse. (Tomaszewski Farias et al., 2012)

EcogPt Everyday Divided Attention

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtDivatt_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3888, 3429 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1820      194      222      3.44       6.7
## thirtile_res=3 2068      272      244      3.11       6.7
## 
##  Chisq= 6.7  on 1 degrees of freedom, p= 0.01

EcogPt Everyday Language

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtLang_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3919, 3398 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1830      309      273      4.63      9.11
## thirtile_res=3 2089      266      302      4.20      9.11
## 
##  Chisq= 9.1  on 1 degrees of freedom, p= 0.003

EcogPt Everyday Memory

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtMem_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3925, 3392 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1828      341      349     0.188      0.37
## thirtile_res=3 2097      396      388     0.169      0.37
## 
##  Chisq= 0.4  on 1 degrees of freedom, p= 0.5

EcogPt Everyday Organization

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtOrgan_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3855, 3462 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1787      299      319      1.26      2.48
## thirtile_res=3 2068      377      357      1.13      2.48
## 
##  Chisq= 2.5  on 1 degrees of freedom, p= 0.1

EcogPt Everyday Planning

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtPlan_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3915, 3402 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1828      442      415      1.74      3.48
## thirtile_res=3 2087      431      458      1.58      3.48
## 
##  Chisq= 3.5  on 1 degrees of freedom, p= 0.06

EcogPt Everyday Visuospatial Abilities

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtVisspat_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3897, 3420 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1827      351      335     0.719      1.43
## thirtile_res=3 2070      350      366     0.660      1.43
## 
##  Chisq= 1.4  on 1 degrees of freedom, p= 0.2

EcogPt Total ???

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogPtTotal_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3919, 3398 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1830      366      356     0.258     0.512
## thirtile_res=3 2089      384      394     0.234     0.512
## 
##  Chisq= 0.5  on 1 degrees of freedom, p= 0.5

Study Partner-Reported Everyday Cognition (EcogSP)

EcogSP Everyday Divided Attention

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPDivatt_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3913, 3404 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1843      632      574      5.76        12
## thirtile_res=3 2070      558      616      5.37        12
## 
##  Chisq= 12  on 1 degrees of freedom, p= 5e-04

EcogSP Everyday Language

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPLang_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3989, 3328 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1872      771      702      6.71      14.1
## thirtile_res=3 2117      683      752      6.27      14.1
## 
##  Chisq= 14.1  on 1 degrees of freedom, p= 2e-04

EcogSP Everyday Memory

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPMem_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3989, 3328 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1872      761      707      4.18      8.75
## thirtile_res=3 2117      702      756      3.90      8.75
## 
##  Chisq= 8.8  on 1 degrees of freedom, p= 0.003

EcogSP Everyday Organization

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPOrgan_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3850, 3467 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1791      532      505      1.42      2.92
## thirtile_res=3 2059      523      550      1.31      2.92
## 
##  Chisq= 2.9  on 1 degrees of freedom, p= 0.09

EcogSP Everyday Planning

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPPlan_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3959, 3358 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1855      701      621     10.23      21.4
## thirtile_res=3 2104      587      667      9.53      21.4
## 
##  Chisq= 21.4  on 1 degrees of freedom, p= 4e-06

EcogSP Everyday Visuospatial Abilities

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPVisspat_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3953, 3364 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1846      688      616      8.39      17.4
## thirtile_res=3 2107      602      674      7.67      17.4
## 
##  Chisq= 17.4  on 1 degrees of freedom, p= 3e-05

EcogSP Total ???

## Call:
## survdiff(formula = Surv(as.integer(VISCODE), EcogSPTotal_cut) ~ 
##     thirtile_res, data = .)
## 
## n=3981, 3336 observations deleted due to missingness.
## 
##                   N Observed Expected (O-E)^2/E (O-E)^2/V
## thirtile_res=1 1871      800      725      7.84      16.5
## thirtile_res=3 2110      698      773      7.35      16.5
## 
##  Chisq= 16.5  on 1 degrees of freedom, p= 5e-05

# Restrict to ages 60 to 70; AGE is not an integer, so use a range test
test <- long_dat %>% filter(between(AGE, 60, 70))

# TODO: also test participants aged 65 to 75